Smoothed Dual Embedding Control

Authors

  • Bo Dai
  • Albert Shaw
  • Lihong Li
  • Lin Xiao
  • Niao He
  • Jianshu Chen
  • Le Song
Abstract

We revisit the Bellman optimality equation with Nesterov’s smoothing technique and, based on Fenchel duality, provide a saddle-point optimization perspective on the policy optimization problem in reinforcement learning. A new reinforcement learning algorithm, called Smoothed Dual Embedding Control or SDEC, is derived to solve the saddle-point reformulation with arbitrary learnable function approximators. The algorithm bypasses the policy evaluation step of policy optimization in a principled way and can be extended to integrate multi-step bootstrapping and eligibility traces. We provide a PAC-learning bound on the number of samples needed from a single off-policy sample path, and also characterize the convergence of the algorithm. Finally, we show that the algorithm compares favorably to state-of-the-art baselines on several benchmark control problems.
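As a rough illustration of what smoothing the Bellman optimality equation can look like, the sketch below replaces the hard maximum over actions with an entropy-smoothed maximum (a log-sum-exp with temperature `lam`) and runs plain value iteration on a tiny tabular problem. This is not the SDEC algorithm: the abstract does not specify the smoothing function, so the entropy-based choice is an assumption, and the MDP (`P`, `R`), the function names, and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def smoothed_max(values, lam):
    """Entropy-smoothed maximum: lam * log(sum(exp(v / lam))).

    Lies between max(values) and max(values) + lam * log(len(values)).
    """
    m = max(values)  # subtract the max for numerical stability
    return m + lam * math.log(sum(math.exp((v - m) / lam) for v in values))


def smoothed_value_iteration(P, R, gamma, lam, iters=500):
    """Iterate V(s) <- smoothed_max over a of R[s][a] + gamma * E[V(s')]."""
    n_states = len(R)
    V = [0.0] * n_states
    for _ in range(iters):
        V = [
            smoothed_max(
                [
                    R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
                    for a in range(len(R[s]))
                ],
                lam,
            )
            for s in range(n_states)
        ]
    return V


# Hypothetical 2-state, 2-action MDP: P[s][a] is a distribution over next states.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.0, 1.0]],
]
R = [[1.0, 0.0], [0.0, 2.0]]

V_smooth = smoothed_value_iteration(P, R, gamma=0.9, lam=0.1)
V_nearly_hard = smoothed_value_iteration(P, R, gamma=0.9, lam=1e-3)
```

As `lam` shrinks, the smoothed maximum approaches the ordinary maximum, so `V_nearly_hard` approximates the solution of the unsmoothed Bellman optimality equation, while a larger `lam` trades some bias for a differentiable operator.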

Similar resources

A Smoothed Complexity Theory

Smoothed analysis is a new way of analyzing algorithms introduced by Spielman and Teng (J. ACM, 2004). Classical methods like worst-case or average-case analysis have accompanying complexity classes, like P and Avg-P, respectively. While worst-case or average-case analysis gives us a means to talk about the running time of a particular algorithm, complexity classes allow us to talk about the in...

On Generalized Injective Spaces in Generalized Topologies

In this paper, we first present a new notion of open sets by expressing some properties of arbitrary mappings on a power set. Generalizing the closure spaces of categorical topology, we introduce generalized topological spaces and the concept of generalized continuity, and describe weak and strong structures for generalized topological spaces. Then, int...

Margins, Kernels and Non-linear Smoothed Perceptrons

We focus on the problem of finding a non-linear classification function that lies in a Reproducing Kernel Hilbert Space (RKHS), both from the primal point of view (finding a perfect separator when one exists) and the dual point of view (giving a certificate of non-existence), with special focus on generalizations of two classical schemes: the Perceptron (primal) and Von-Neumann (dual) algorithms....

Embedding Lagrangian Sink Particles in Eulerian Grids

We introduce a new computational method for embedding Lagrangian sink particles into an Eulerian calculation. Simulations of gravitational collapse or accretion generally produce regions whose density greatly exceeds the mean density in the simulation. These dense regions require extremely small time steps to maintain numerical stability. Smoothed particle hydrodynamics (SPH) codes approach thi...

An Inexact Perturbed Path-Following Method for Lagrangian Decomposition in Large-Scale Separable Convex Optimization

This paper studies an inexact perturbed path-following algorithm in the framework of Lagrangian dual decomposition for solving large-scale separable convex programming problems. Unlike the exact versions considered in the literature, we propose to solve the primal subproblems inexactly up to a given accuracy. This leads to an inexactness of the gradient vector and the Hessian matrix of the smoo...

Journal title:
  • CoRR

Volume abs/1712.10285

Publication date 2017